Learning Effective Word Embedding using Morphological Word Similarity
Authors
Abstract
Deep learning techniques aim at obtaining high-quality distributed representations of words, i.e., word embeddings, to address text mining and natural language processing tasks. Recently, efficient methods have been proposed to learn word embeddings from context that capture both semantic and syntactic relationships between words. However, it is challenging to handle unseen or rare words with insufficient context. In this paper, inspired by the study of the word recognition process in cognitive psychology, we propose to take advantage of seemingly less obvious but essentially important morphological word similarity to address these challenges. In particular, we introduce a novel neural network architecture that leverages both contextual information and morphological word similarity to learn word embeddings. Meanwhile, the learning architecture is also able to refine the pre-defined morphological knowledge and obtain more accurate word similarity. Experiments on an analogical reasoning task and a word similarity task both demonstrate that the proposed method can greatly enhance the effectiveness of word embeddings.
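To make the notion of "morphological word similarity" concrete, here is a minimal illustrative sketch (an assumption on our part, not the paper's actual similarity definition): two words are scored by the Jaccard overlap of their character n-gram sets, so that morphologically related forms such as "walked" and "walking" score higher than unrelated pairs.

```python
# Illustrative sketch only: score morphological similarity between two words
# as the Jaccard overlap of their character n-gram sets. This is a common
# proxy for shared morphemes, not the similarity measure used in the paper.

def char_ngrams(word, n=3):
    """Return the set of character n-grams of `word`, with boundary markers."""
    padded = f"<{word}>"  # '<' and '>' distinguish prefixes and suffixes
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def morph_similarity(w1, w2, n=3):
    """Jaccard similarity of the two words' character n-gram sets."""
    a, b = char_ngrams(w1, n), char_ngrams(w2, n)
    return len(a & b) / len(a | b)
```

For example, `morph_similarity("walked", "walking")` is positive because the two forms share the n-grams `<wa`, `wal`, and `alk`, while `morph_similarity("walked", "table")` is zero.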
Similar Papers
Knowledge-Powered Deep Learning for Word Embedding
The basis of applying deep learning to solve natural language processing tasks is to obtain high-quality distributed representations of words, i.e., word embeddings, from large amounts of text data. However, text itself usually contains incomplete and ambiguous information, which makes it necessary to leverage extra knowledge to understand it. Fortunately, text itself already contains well-defined ...
A Framework for Learning Knowledge-Powered Word Embedding
Neural network techniques are widely applied to obtain high-quality distributed representations of words, i.e., word embeddings, to address text mining and natural language processing tasks. Recently, efficient methods have been proposed to learn word embeddings from context that captures both semantic and syntactic relationships between words. However, it is challenging to handle unseen words ...
Morpheme-Enhanced Spectral Word Embedding
Traditional word embedding models only learn word-level semantic information from a corpus while neglecting the valuable semantic information of words' internal structures, such as morphemes. To address this problem, the goal of this paper is to exploit morphological information to enhance the quality of word embeddings. Based on spectral methods, we propose two word embedding models: Morpheme on ...
Character Word Embedding for NLP tasks in Indian Languages
Recently, word embeddings have been used as an unsupervised approach to achieve results comparable to those of supervised methods that use handcrafted features. But information about word morphology and shape is normally ignored when learning word representations. Character-level embedding can capture intra-word information, especially when dealing with morphologically rich languages. Her...
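One common way to realize the character-level idea described above is to compose a word's vector from embeddings of its character n-grams, so that words sharing morphemes end up nearby in the vector space. The sketch below is an assumed, fastText-style simplification for illustration; the embedding table is random here, standing in for parameters a real model would learn.

```python
import zlib

import numpy as np

# Illustrative sketch (assumed setup, not the paper's actual model): compose a
# word vector as the average of hashed character n-gram embeddings, so words
# sharing morphemes (e.g. "walked"/"walking") receive similar vectors.
DIM, BUCKETS = 50, 10_000
rng = np.random.default_rng(0)
ngram_table = rng.normal(size=(BUCKETS, DIM))  # stand-in for learned n-gram embeddings

def word_vector(word, n=3):
    """Average the (hashed) character n-gram embeddings of `word`."""
    padded = f"<{word}>"  # boundary markers distinguish prefixes and suffixes
    grams = [padded[i:i + n] for i in range(len(padded) - n + 1)]
    rows = [ngram_table[zlib.crc32(g.encode()) % BUCKETS] for g in grams]
    return np.mean(rows, axis=0)
```

Because the vector is built entirely from character n-grams, this scheme can also produce a vector for a word never seen in training, which is exactly the rare/unseen-word problem the main paper targets.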
Multi-phase Word Sense Embedding Learning Using a Corpus and a Lexical Ontology
Word embeddings play a significant role in many modern NLP systems. However, most prevalent word embedding learning methods learn one representation per word which is problematic for polysemous words and homonymous words. To address this problem, we propose a multi-phase word sense embedding learning method which utilizes both a corpus and a lexical ontology to learn one embedding per word sens...
Journal: CoRR
Volume: abs/1407.1687
Pages: -
Publication date: 2014